NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

End-to-End Learning Framework for Solving Non-Markovian Optimal Control

Zhang, Xiaole; Zhang, Peiyu; Xiao, Xiongye; Li, Shixuan; Tzoumas, Vasileios; Gupta, Vijay; Bogdan, Paul (August 2025, International Conference on Machine Learning (ICML))

Integer-order calculus fails to capture the long-range dependence (LRD) and memory effects found in many complex systems. Fractional calculus addresses these gaps through fractional-order integrals and derivatives, but fractional-order dynamical systems pose substantial challenges in system identification and optimal control tasks. In this paper, we theoretically derive the optimal control via linear quadratic regulator (LQR) for fractional-order linear time-invariant (FOLTI) systems and develop an end-to-end deep learning framework based on this theoretical foundation. Our approach establishes a rigorous mathematical model, derives analytical solutions, and incorporates deep learning to achieve data-driven optimal control of FOLTI systems. Our key contributions include: (i) proposing a novel method for system identification and optimal control strategy in FOLTI systems, (ii) developing the first end-to-end data-driven learning framework, Fractional-Order Learning for Optimal Control (FOLOC), that learns control policies from observed trajectories, and (iii) deriving theoretical bounds on the sample complexity for learning accurate control policies under fractional-order dynamics. Experimental results indicate that our method accurately approximates fractional-order system behaviors without relying on Gaussian noise assumptions, pointing to promising avenues for advanced optimal control.
Free, publicly-accessible full text available August 1, 2026
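The long-range memory that the abstract above attributes to fractional-order systems can be illustrated with a small simulation. This is a minimal sketch, assuming a scalar system and a Grünwald-Letnikov discretization of the fractional difference; the listing does not say which discretization the paper uses, and the names `gl_weights` and `simulate_folti` are hypothetical.

```python
def gl_weights(alpha, n_terms):
    """Grunwald-Letnikov weights psi_j = (-1)^j * binom(alpha, j), j = 0..n_terms-1."""
    weights = [1.0]
    for j in range(1, n_terms):
        # Recurrence: psi_j = psi_{j-1} * (j - 1 - alpha) / j
        weights.append(weights[-1] * (j - 1 - alpha) / j)
    return weights


def simulate_folti(alpha, a, b, x0, controls):
    """Simulate the assumed scalar model  Delta^alpha x[k+1] = a*x[k] + b*u[k].

    Expanding the fractional difference gives
        x[k+1] = a*x[k] + b*u[k] - sum_{j=1}^{k+1} psi_j * x[k+1-j],
    so every past state contributes to the next one (non-Markovian memory).
    """
    psi = gl_weights(alpha, len(controls) + 1)
    xs = [x0]
    for k, u in enumerate(controls):
        memory = sum(psi[j] * xs[k + 1 - j] for j in range(1, k + 2))
        xs.append(a * xs[k] + b * u - memory)
    return xs
```

With `alpha = 1` the weights reduce to `1, -1, 0, 0, ...` and the model becomes an ordinary first-order difference equation. For `0 < alpha < 1` the weights decay slowly, so the whole state history enters each update.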
End-to-End Learning Framework for Solving Non-Markovian Optimal Control

Zhang, Xiaole; Zhang, Peiyu; Xiao, Xiongye; Li, Shixuan; Tzoumas, Vasileios; Gupta, Vijay; Bogdan, Paul (July 2025, Proceedings of Machine Learning Research)

Free, publicly-accessible full text available July 16, 2026
A Structure-Aware Framework for Learning Device Placements on Computation Graphs

Duan, Shukai; Ping, Heng; Kanakaris, Nikos; Xiao, Xiongye; Kyriakis, Panagiotis; Ahmed, Nesreen K; Zhang, Peiyu; Ma, Guixiang; Capotă, Mihai; Nazarian, Shahin; et al (December 2024, NeurIPS)

Computation graphs are Directed Acyclic Graphs (DAGs) whose nodes correspond to mathematical operations; they are widely used as abstractions in the optimization of neural networks. The device placement problem aims to identify optimal allocations of those nodes to a set of (potentially heterogeneous) devices. Existing approaches rely on two types of architectures, known as grouper-placer and encoder-placer. In this work, we bridge the gap between encoder-placer and grouper-placer techniques and propose a novel framework for the task of device placement, relying on smaller computation graphs extracted from the OpenVINO toolkit. The framework consists of five steps, including graph coarsening, node representation learning, and policy optimization. It facilitates end-to-end training and takes into account the DAG nature of the computation graphs. We also propose a model variant, inspired by graph parsing networks and complex network analysis, that enables joint graph representation learning and personalized graph partitioning with an unspecified number of groups. To train the entire framework, we use reinforcement learning with the execution time of the placement as the reward. We demonstrate the flexibility and effectiveness of our approach through multiple experiments with three benchmark models, namely Inception-V3, ResNet, and BERT. The robustness of the proposed framework is also highlighted through an ablation study. The suggested placements improve the inference speed of the benchmark models relative to both CPU execution and other commonly used baselines.
Full Text Available
Discovering Malicious Signatures in Software from Structural Interactions

https://doi.org/10.1109/ICASSP48485.2024.10446565

Yin, Chenzhong; Zhang, Hantang; Cheng, Mingxi; Xiao, Xiongye; Chen, Xinghe; Ren, Xin; Bogdan, Paul (April 2024, IEEE)
Ko, Hanseok (Ed.)
Malware represents a significant security concern in today’s digital landscape, as it can destroy or disable operating systems, steal sensitive user information, and occupy valuable disk space. However, current malware detection methods, such as static-based and dynamic-based approaches, struggle to identify newly developed ("zero-day") malware and are limited by customized virtual machine (VM) environments. To overcome these limitations, we propose a novel malware detection approach that leverages deep learning, mathematical techniques, and network science. Our approach focuses on static and dynamic analysis and utilizes the Low-Level Virtual Machine (LLVM) to profile applications within a complex network. The generated network topologies are input into the GraphSAGE architecture to efficiently distinguish between benign and malicious software applications, with the operation names denoted as node features. Importantly, the GraphSAGE models analyze the network’s topological geometry to make predictions, enabling them to detect state-of-the-art malware and prevent potential damage during execution in a VM. To evaluate our approach, we conduct a study on a dataset comprising source code from 24,376 applications, specifically written in C/C++, sourced directly from widely-recognized malware and various types of benign software. The results show a high detection performance with an Area Under the Receiver Operating Characteristic Curve (AUROC) of 99.85%. Our approach marks a substantial improvement in malware detection, providing a notably more accurate and efficient solution when compared to current state-of-the-art malware detection methods. The code is released at https://github.com/HantangZhang/MGN.
Full Text Available
Neuro-Inspired Information-Theoretic Hierarchical Perception for Multimodal Learning

Xiao, Xiongye; Liu, Gengshuo; Gupta, Gaurav; Cao, Defu; Li, Shixuan; Li, Yaxing; Fang, Tianqing; Cheng, Mingxi; Bogdan, Paul (May 2024, Twelfth International Conference on Learning Representations (ICLR))
Kim, Been (Ed.)
Integrating and processing information from various sources or modalities are critical for obtaining a comprehensive and accurate perception of the real world in autonomous systems and cyber-physical systems. Drawing inspiration from neuroscience, we develop the Information-Theoretic Hierarchical Perception (ITHP) model, which utilizes the concept of information bottleneck. Different from most traditional fusion models that incorporate all modalities identically in neural networks, our model designates a prime modality and regards the remaining modalities as detectors in the information pathway, serving to distill the flow of information. Our proposed perception model focuses on constructing an effective and compact information flow by achieving a balance between the minimization of mutual information between the latent state and the input modal state, and the maximization of mutual information between the latent states and the remaining modal states. This approach leads to compact latent state representations that retain relevant information while minimizing redundancy, thereby substantially enhancing the performance of multimodal representation learning. Experimental evaluations on the MUStARD, CMU-MOSI, and CMU-MOSEI datasets demonstrate that our model consistently distills crucial information in multimodal learning scenarios, outperforming state-of-the-art benchmarks. Remarkably, on the CMU-MOSI dataset, ITHP surpasses human-level performance in the multimodal sentiment binary classification task across all evaluation metrics (i.e., Binary Accuracy, F1 Score, Mean Absolute Error, and Pearson Correlation).
Full Text Available
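One way to write the trade-off described in the ITHP abstract above is the standard information bottleneck Lagrangian. The symbols are notation assumed here for illustration and are not taken from the listing: X_0 is the prime modality, X_1 a remaining modality, B the latent state, and β the trade-off weight.

```latex
\min_{p(B \mid X_0)} \; I(X_0; B) \;-\; \beta \, I(B; X_1), \qquad \beta > 0
```

Minimizing the first term compresses the prime modality, while the second term rewards latent states that stay informative about the other modality.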
Unlocking Deep Learning: A BP-Free Approach for Parallel Block-Wise Training of Neural Networks

https://doi.org/10.1109/ICASSP48485.2024.10447377

Cheng, Anzhe; Ping, Heng; Wang, Zhenkun; Xiao, Xiongye; Yin, Chenzhong; Nazarian, Shahin; Cheng, Mingxi; Bogdan, Paul (April 2024, IEEE)
Ko, Hanseok (Ed.)
Backpropagation (BP) has been a successful optimization technique for deep learning models. However, its limitations, such as backward- and update-locking, and its biological implausibility, hinder the concurrent updating of layers and do not mimic the local learning processes observed in the human brain. To address these issues, recent research has suggested using local error signals to asynchronously train network blocks. However, this approach often involves extensive trial-and-error iterations to determine the best configuration for local training. This includes decisions on how to decouple network blocks and which auxiliary networks to use for each block. In our work, we introduce a novel BP-free approach: a block-wise BP-free (BWBPF) neural network that leverages local error signals to optimize distinct sub-neural networks separately, where the global loss is only responsible for updating the output layer. The local error signals used in the BP-free model can be computed in parallel, enabling a potential speed-up in the weight update process through parallel implementation. Our experimental results consistently show that this approach can identify transferable decoupled architectures for VGG and ResNet variations, outperforming models trained with end-to-end backpropagation and other state-of-the-art block-wise learning techniques on datasets such as CIFAR-10 and Tiny-ImageNet. The code is released at https://github.com/Belis0811/BWBPF.
Full Text Available
Non-Linear Operator Approximations for Initial Value Problems

Gupta, Gaurav; Xiao, Xiongye; Balan, Radu; Bogdan, Paul (April 2022, International Conference on Learning Representations (ICLR))

Time-evolution of partial differential equations is fundamental for modeling several complex dynamical processes and for event forecasting, but the operators associated with such problems are non-linear. We propose a Padé approximation based exponential neural operator scheme for efficiently learning the map between a given initial condition and the activities at a later time. Multiwavelet bases are used for space discretization. By explicitly embedding the exponential operators in the model, we reduce the number of training parameters and make the model more data-efficient, which is essential in dealing with scarce and noisy real-world datasets. The Padé exponential operator uses a recurrent structure with shared parameters to model the non-linearity, in contrast to recent neural operators that rely on multiple linear operator layers in succession. We show theoretically that the gradients associated with the recurrent Padé network are bounded across the recurrent horizon. We perform experiments on non-linear systems such as the Korteweg-de Vries (KdV) and Kuramoto–Sivashinsky (KS) equations to show that the proposed approach achieves the best performance and at the same time is data-efficient. We also show that urgent real-world problems like epidemic forecasting (for example, COVID-19) can be formulated as a 2D time-varying operator problem. The proposed Padé exponential operators yield better prediction results than state-of-the-art forecasting models: 53% better MAE than the best neural operator and 52% better than the best non-neural-operator deep learning model.
Full Text Available
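The Padé approximation named in the abstract above replaces the exponential with a ratio of two polynomials. The sketch below shows only the classical scalar [p/q] Padé approximant of exp(x); it is not the paper's neural operator, and the function names `pade_exp_coeffs` and `pade_exp` are hypothetical.

```python
from math import exp, factorial


def pade_exp_coeffs(p, q):
    """Coefficients of the [p/q] Pade approximant exp(x) ~ N(x) / D(x).

    Returns (num, den) where num[j] and den[j] multiply x**j.
    """
    norm = factorial(p + q)
    num = [
        factorial(p + q - j) * factorial(p) / (norm * factorial(j) * factorial(p - j))
        for j in range(p + 1)
    ]
    den = [
        (-1) ** j * factorial(p + q - j) * factorial(q)
        / (norm * factorial(j) * factorial(q - j))
        for j in range(q + 1)
    ]
    return num, den


def pade_exp(x, p=2, q=2):
    """Evaluate the [p/q] Pade approximant of exp at a scalar x."""
    num, den = pade_exp_coeffs(p, q)
    numerator = sum(c * x ** j for j, c in enumerate(num))
    denominator = sum(c * x ** j for j, c in enumerate(den))
    return numerator / denominator


if __name__ == "__main__":
    # Compare the approximant with the library exponential near zero.
    for x in (0.1, 0.5, 1.0):
        print(x, pade_exp(x), exp(x))
```

For `p = q = 1` this gives the familiar `(1 + x/2) / (1 - x/2)`. The same rational form applies to operators, with the division replaced by a linear solve.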
Deciphering the generating rules and functionalities of complex networks

https://doi.org/10.1038/s41598-021-02203-4

Xiao, Xiongye; Chen, Hanlong; Bogdan, Paul (December 2021, Scientific Reports)

Network theory helps us understand, analyze, model, and design various complex systems. Complex networks encode the complex topology and structural interactions of various systems in nature. To mine the multiscale coupling, heterogeneity, and complexity of natural and technological systems, we need expressive and rigorous mathematical tools that can help us understand the growth, topology, dynamics, multiscale structures, and functionalities of complex networks and their interrelationships. Towards this end, we construct the node-based fractal dimension (NFD) and the node-based multifractal analysis (NMFA) framework to reveal the generating rules and quantify the scale-dependent topology and multifractal features of a dynamic complex network. We propose novel indicators for measuring the degree of complexity, heterogeneity, and asymmetry of network structures, as well as the structure distance between networks. This formalism provides new insights on learning the energy and phase transitions in the networked systems and can help us understand the multiple generating mechanisms governing the network evolution.
Full Text Available
Generator based approach to analyze mutations in genomic datasets

https://doi.org/10.1038/s41598-021-00609-8

Jain, Siddharth; Xiao, Xiongye; Bogdan, Paul; Bruck, Jehoshua (December 2021, Scientific Reports)
In contrast to the conventional approach of directly comparing genomic sequences using sequence alignment tools, we propose a computational approach that performs comparisons between sequence generators. These sequence generators are learned via a data-driven approach that empirically computes the state machine generating the genomic sequence of interest. As the state machine based generator of the sequence is independent of the sequence length, it provides us with an efficient method to compute the statistical distance between large sets of genomic sequences. Moreover, our technique provides a fast and efficient method to cluster large datasets of genomic sequences, characterize their temporal and spatial evolution in a continuous manner, and gain insight into locality-sensitive information about the sequences without any need for alignment. Furthermore, we show that the technique can be used to detect local regions with mutation activity, which can then be applied to aid alignment techniques for the fast discovery of mutations. To demonstrate the efficacy of our technique on real genomic data, we cluster different strains of SARS-CoV-2 viral sequences, characterize their evolution, and identify regions of the viral sequence with mutations.
Full Text Available
Unifying structural descriptors for biological and bioinspired nanoscale complexes

https://doi.org/10.1038/s43588-022-00229-w

Cha, Minjeong; Emre, Emine Sumeyra; Xiao, Xiongye; Kim, Ji-Young; Bogdan, Paul; VanEpps, J. Scott; Violi, Angela; Kotov, Nicholas A. (April 2022, Nature Computational Science)

Full Text Available
